Hierarchical structures induce long-range dynamical correlations in written texts



Similar resources

On the nature of long-range letter correlations in texts

The origin of long-range letter correlations in natural texts is studied using random walk analysis and Jensen–Shannon divergence. It is concluded that they result from slow variations in letter frequency distribution, which are a consequence of slow variations in lexical composition within the text. These correlations are preserved by random letter shuffling within a moving window. As such, th...

On the origin of long-range correlations in texts

The complexity of human interactions with social and natural phenomena is mirrored in the way we describe our experiences through natural language. In order to retain and convey such high-dimensional information, the statistical properties of our linguistic output have to be highly correlated in time. An example are the robust observations, still largely not understood, of correlations on arbi...

Word-length entropies and correlations of natural language written texts

We study the frequency distributions and correlations of the word lengths of ten European languages. Our findings indicate that a) the word-length distribution of short words quantified by the mean value and the entropy distinguishes the Uralic (Finnish) corpus from the others, b) the tails at long words, manifested in the high-order moments of the distributions, differentiate the Germanic lang...

Hierarchical structures in Sturmian dynamical systems

We study hierarchical properties of Sturmian words. These properties are similar to those of substitution dynamical systems. This approach allows one to carry over to Sturmian dynamical systems methods developed in the context of substitutions. For example, it allows for a proof of an ergodic type theorem for additive functions taking values in a Banach space. We then focus on establishing vari...

Computer and Natural Language Texts - A Comparison Based on Long-Range Correlations

“Long range power law correlation” (LRC) is defined as the maximal propagation distance of the effect of some disturbance within a system, and it is found in many systems that can be represented as strings of symbols. LRC between characters has also been identified in natural language texts. The aim of this paper is to show that long range power law correlation can also be found in computer programs meaning...

Journal

Journal title: Proceedings of the National Academy of Sciences

Year: 2006

ISSN: 0027-8424,1091-6490

DOI: 10.1073/pnas.0510673103